Proposal by Javier Pimás for Memory Manager for SqueakNOS

Proposed by Javier Pimás


How I will do this project

Currently Squeak doesn't have a memory manager of its own. It depends on the operating system's mechanisms to arrange the blocks of memory it uses into pages or segments, which may or may not be stored on disk. In SqueakNOS there is no underlying OS, and the paging and segmentation mechanisms are disabled for now. SqueakNOS actually runs in protected mode and sees the whole memory as a single segment the size of the total memory.
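To make the contrast with the flat segment concrete, here is a minimal sketch (in Python, purely for illustration; the real work would be done in Smalltalk and a bit of C) of what classic 32-bit x86 paging with 4 KiB pages would add: every linear address is split into a page-directory index, a page-table index and an offset within the page. The function name is hypothetical and not part of SqueakNOS.

```python
PAGE_SHIFT = 12                 # 4 KiB pages: the low 12 bits are the offset
TABLE_BITS = 10                 # 1024 entries per page table and per directory
INDEX_MASK = (1 << TABLE_BITS) - 1
OFFSET_MASK = (1 << PAGE_SHIFT) - 1


def split_linear_address(addr):
    """Split a 32-bit linear address the way two-level x86 paging does.

    Returns (directory_index, table_index, offset). Hypothetical helper,
    shown only to illustrate the address layout.
    """
    if not 0 <= addr < (1 << 32):
        raise ValueError("address must fit in 32 bits")
    directory_index = addr >> (PAGE_SHIFT + TABLE_BITS)   # top 10 bits
    table_index = (addr >> PAGE_SHIFT) & INDEX_MASK       # middle 10 bits
    offset = addr & OFFSET_MASK                           # low 12 bits
    return directory_index, table_index, offset
```

For example, split_linear_address(0x00403025) yields directory index 1, table index 3 and offset 0x25. With paging disabled, as it is today, the same address is simply used as is.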

As a prerequisite for this job I'll need hard-disk support, which I've already implemented and which is now working and in good shape. A good knowledge of the internals of the virtual machine is also required, which I think I have (you can see some posts about it on my website, or look into the Pharo mailing list archive).

Another place to look is Squeak's garbage collector. Altering memory management will probably require a deep understanding of the current implementation and the theory behind it, as described in [1]. But it may also be a good place to think of other, more intelligent ideas that could complement it, like [3], which improves the GC by offloading some (probably unused) objects to disk.

I'll also have to think of innovative ways of handling virtual memory. It would be a shame to build a high-level operating system only to resort to the "classic" memory management methods. That's why I refer to the LOOM paper [2] as a starting point.

What methodologies will I use

This work will require some Smalltalk and a bit of C programming, and I feel very confident with both. Good design is a must here: we are talking about a memory management scheme designed for Smalltalk, so it wouldn't make sense to do it as a direct translation of a low-level C memory manager. A high-level language like Smalltalk, together with TDD, will help to reduce the cost of this complex job.

Because we are working on the operating system itself, the best methodology is to work in a virtualized environment, like VMware, rather than on the real hardware.

Suggested timeline and milestones

To complement the knowledge acquired in the Advanced Object-Oriented Design classes I took at university, I'd start by re-reading the referenced papers and implementing one of them.

In parallel, I could already start designing and investigating how to implement the naive scheme. Having a working virtual memory manager would be the first and most important milestone. I'll then have enough feedback to determine which innovative method could be implemented without exceeding the scope and the time available. It will also allow me to develop a profiling utility to test it against the other management method.
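As a rough idea of what such a profiling utility could measure, the sketch below (Python, for illustration only) replays a trace of page accesses against a fixed number of frames and counts page faults under two well-known replacement policies, FIFO and LRU. The function name, the trace format and the choice of policies are all assumptions made for the example, not part of the proposal's design.

```python
from collections import OrderedDict


def count_page_faults(trace, frames, policy="lru"):
    """Count page faults for a sequence of page numbers.

    trace  -- iterable of page numbers, in access order (assumed format)
    frames -- number of physical frames available
    policy -- "fifo" or "lru"
    """
    if frames < 1:
        raise ValueError("need at least one frame")
    if policy not in ("fifo", "lru"):
        raise ValueError("unknown policy: %r" % (policy,))

    resident = OrderedDict()    # resident pages, next eviction victim first
    faults = 0
    for page in trace:
        if page in resident:
            if policy == "lru":
                # A hit refreshes the page under LRU; FIFO ignores hits.
                resident.move_to_end(page)
            continue
        faults += 1
        if len(resident) >= frames:
            resident.popitem(last=False)    # evict the page at the front
        resident[page] = True
    return faults
```

With the trace [1, 2, 3, 1, 4, 1, 5] and 3 frames, FIFO takes 6 faults and LRU takes 5, because LRU keeps the frequently used page 1 resident. A real profiler would of course record traces from the running image instead of using hand-written lists.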

Where I see the risks

The classic schemes of paging and segmentation are tried and tested, but that doesn't mean they are easy to implement, and in a system that runs on top of an Object Engine there may be even more complications. As an example, handling page allocations may be easy in theory. But the image contains objects that may be moved in memory, and many of those objects make up the operating system's memory manager itself, which should never be offloaded to disk. Once you take that into account, it is no longer so easy to implement.
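The constraint that the memory manager's own objects must stay resident can be sketched as follows (Python, for illustration only). The class below keeps a fixed-size set of resident pages, evicts the least recently used one when it is full, and never evicts pages marked as pinned. All names are hypothetical, and a real implementation would also have to deal with objects moving between pages, which this sketch ignores.

```python
from collections import OrderedDict


class ResidentSet:
    """Fixed-size set of resident pages with LRU eviction and pinning."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()   # page number -> pinned?, oldest first
        self.swapped_out = []        # pages offloaded to disk, in order

    def touch(self, page, pinned=False):
        """Access a page, loading it (and evicting another) if needed."""
        if page in self.pages:
            # Once pinned, a page stays pinned; the access makes it recent.
            self.pages[page] = self.pages[page] or pinned
            self.pages.move_to_end(page)
            return
        if len(self.pages) >= self.capacity:
            self._evict_one()
        self.pages[page] = pinned

    def _evict_one(self):
        # Pick the least recently used page that is not pinned.
        victim = next(
            (p for p, is_pinned in self.pages.items() if not is_pinned),
            None,
        )
        if victim is None:
            raise MemoryError("every resident page is pinned")
        del self.pages[victim]
        self.swapped_out.append(victim)
```

For instance, with a capacity of 2, touching page 0 as pinned and then pages 1 and 2 swaps out page 1, even though page 0 is older. If every resident page is pinned there is nothing left to evict, which is exactly the kind of situation the real design has to avoid.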

Other ways of doing memory management may have some other independent complications, which will have to be considered after researching them, so there we'll have to handle the unknowns.

Memory management is not an easy task. A bad memory management scheme can ruin an operating system's performance, which is why I'll need a profiling tool (and, I think, a lot of tweaking).

What the results will look like

After this GSoC we should have a naive memory manager for SqueakNOS, a profiling utility, and a more interesting memory manager, all implemented mostly in Smalltalk. Imagine having a memory manager that you could alter by sending messages and which you could even replace on the fly!

References

[1] Generation Scavenging: A Non-disruptive High Performance Storage Reclamation Algorithm. http://portal.acm.org/citation.cfm?id=808261

[2] LOOM - large object-oriented memory for Smalltalk-80 systems. http://portal.acm.org/citation.cfm?id=94112

[3] Tolerating Memory Leaks - OOPSLA 2008 - http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.140.2127&rep=rep1&type=pdf




Updated: 15.4.2010